perm filename PATREC[4,KMC]1 blob
sn#073132 filedate 1973-11-21 generic text, type T, neo UTF8
00100 A PATTERN RECOGNITION ALGORITHM WHICH CHARACTERIZES NATURAL LANGUAGE
00200 DIALOGUE EXPRESSIONS
00300
00400
00500
00600 COLBY AND PARKISON
00700
00800 OUTLINE
00900 INTRODUCTORY -Discussion of language as code, other approaches
01000 sentence versus word dictionary using projection
01100 rules to yield an interpretation from word definitions.
01200 experience with old Parry.
01300 PROBLEMS -dialogue problems and methods. Constraints. Special cases.
01400 Pattern recognition
01500 Preprocessing- dict words only
01600 translations
01700 contractions
01800 expansions
01900 synonyms
02000 negation
02100 Segmenting - prepositions, wh-words, meta-verbs
02200 give list
02300 Matching - simple and compound patterns
02400 association with semantic functions
02500 first coarsening - drop fillers- give list
02600 second coarsening - drop one word at a time
02700 dangers of insertion and restoration
02710 Recycle condition- sometimes a pattern containing pronouns
02720 is matched, like "DO YOU AVOID THEM". If THEM could be
02730 a number of different things and Parry's answer depends on
02740 which one it is, then the current value of the anaphora,
02750 THEM, is substituted for THEM and the resulting pattern
02760 is looked up. Hopefully, this will produce a match to a
02770 more specific pattern, like "DO YOU AVOID MAFIA".
02800 default condition - pass surface to memory
02900 change topic or level
03000 Advantages - real-time performance, pragmatic adequacy and
03100 effectiveness, performance measures.
03200 "learning" by adding patterns
03300 PARRY1 ignored word order- penalty too great
03400 PARRY1 too sequential taking first pattern it found
03500 rather than looking at whole input and then deciding.
03600 PARRY1 had patterns strung out throughout procedures
03700 and thus cumbersome for programmer to see what patterns were.
03800 Limitations - typical failures, possible remedies
03900 Summary
04000
04100
04200 We assume anyone who has read this far knows what pattern
04300 recognition is. By "characterize" we mean a multi-stage
04400 sequence of functions which transforms natural language input
04500 expressions into a pattern which best matches a stored pattern whose
04600 name has a pointer to the name of a response function. Response
04700 functions decide what to do once the input has been characterized.
04800 Here we shall discuss only the characterizing functions, reserving a
04900 description of the response functions for another communication.
05000 In constructing and testing a simulation of paranoid
05100 processes, we were faced with the problem of reproducing paranoid
05200 linguistic behavior in a psychiatric interview. The diagnosis of
05300 paranoid states or reactions or modes is made by clinicians who judge
05400 a degree of correspondence between what they observe and their
05500 conceptual model of paranoid behavior. There exists a high degree of
05600 agreement about this conceptual model which relies mainly on what an
05700 interviewee says and how he says it.
05800 Natural language is a code people use for communication. In a
05900 dialogue such as a psychiatric interview, the participants have
06000 intentions and expectations which are revealed in their linguistic
06100 expressions. To produce the effect of having an interviewer
06200 experience a dialogue he would judge paranoid, a simulation of a
06300 paranoid patient must be able to demonstrate typical paranoid
06400 interview behavior. Thus it must be able to deal with the
06500 linguistic behavior of the interviewer well enough to achieve the
06600 desired effects.
06700 There are a number of approaches one could take in handling
06800 dialogue expressions. One approach would be to have a sentence
06900 dictionary of all expressions which might come up in an interview.
07000 Associated with each sentence would be its interpretations depending
07100 on context. No one as yet has taken this approach seriously. Instead
07200 of a sentence dictionary, one might construct a word dictionary and
07300 then use projection rules to yield an interpretation of a sentence
07400 from the dictionary definitions. This, for example, has been the
07500 approach of Winograd [ ] and Woods [ ]. Such a method works as long
07600 as the dictionary remains relatively small, each word has only one or
07700 two senses, and the dialogue is limited to a small world of objects
07800 and relations. But the problems which arise in a psychiatric
07900 interview in unrestricted English are too great for this method to be
08000 useful.
08100 We have developed another approach involving pattern
08200 recognition which is successful a high percentage of the time. (No
08300 one expects an algorithm to be successful 100% of the time since not
08400 even humans, the best natural language system around, achieve this
08500 level of performance). The main power of a pattern recognition
08600 approach lies in its ability to ignore unintelligible expressions and
08700 irrelevant details. A conventional parser doing word-by-word analysis
08800 fails when it cannot find one of the input words in its dictionary.
09000 It must know; it cannot guess.
09000 In early versions of the paranoid model (PARRY1) the
09100 pattern recognition mechanisms were quite crude. For example,
09200 consider the following expressions:
09300 (1) WHERE DO YOU WORK?
09400 (2) WHAT SORT OF WORK DO YOU DO ?
09500 (3) WHAT IS YOUR OCCUPATION ?
09600 (4) WHAT DO YOU DO FOR A LIVING ?
09700 (5) WHERE ARE YOU EMPLOYED ?
09800 In PARRY1 a procedure would scan these expressions looking for an
09900 "occupation" contentive such as "work", "for a living", etc. If it
10000 found such a contentive along with a "you" or "your" in the
10100 expression, regardless of word order, it would respond to the
10200 expression as if it were a question about where he works. (There is
10300 some doubt this even qualifies as a pattern since interrelations
10400 between words are ignored and only their presence is considered). An
10500 insensitivity to word order has the advantage that different parts of
10600 speech can represent the same concept, e.g. "work" as noun or as
10700 verb. But we found from experience that the risk, the mean penalty of
10800 errors, was too great. Hence in the present model, as will be described
10900 in detail, we utilize patterns which require a specified word order.
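PARRY1's order-insensitive test can be sketched as follows. This is an illustrative reconstruction in Python, not the original code; the word lists and function name are invented for the example:

```python
# A small stand-in for PARRY1's "occupation" contentives.
OCCUPATION_CONTENTIVES = {"work", "job", "occupation", "living", "employed"}

def looks_like_occupation_question(expression):
    """Respond as an occupation question if any occupation contentive
    appears together with "you" or "your", regardless of word order."""
    words = set(expression.lower().replace("?", " ").split())
    return bool(words & OCCUPATION_CONTENTIVES) and bool(words & {"you", "your"})

# All five example questions trigger the same response...
assert looks_like_occupation_question("WHERE DO YOU WORK?")
assert looks_like_occupation_question("WHAT DO YOU DO FOR A LIVING ?")
# ...but so does an input that is not such a question at all,
# which is exactly the penalty of ignoring word order:
assert looks_like_occupation_question("YOU SHOULD NOT WORK SO HARD")
```

The last assertion shows the risk discussed above: mere co-presence of a contentive and "you" is taken for a question about occupation.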
11100 As everyone who deals with hyper-complex problems knows, it
11200 is useful to have constraints. A psychiatric interview has several
11300 constraints. Clinicians are trained to ask certain questions in
11400 certain ways. These stereotypes can be treated as special cases. Only
11500 a few hundred topics are commonly brought up by interviewers. When
11600 the interview is conducted over teletypes, short expressions are used
11700 since the interviewer tries to increase the information transmission
11800 rate over the slow channel of a teletype. (It is said that short
11900 expressions tend to be more grammatical, but think about the phrase
12000 "Now now, there there.") Teletype interviews involve written speech.
12100 This speech is full of idioms, cliches, pat phrases, etc. - all being
12200 easy prey for a pattern recognition approach. It is hopeless to try
12300 to decode an idiom by analyzing the meanings of its individual words.
12400 One knows what an idiom refers to or one does not.
12500 We shall describe the algorithm in three sections
12600 devoted to preprocessing, segmenting, and matching.
12700
12800 PREPROCESSING
12900
13000 Each word in the input expression is first looked up in a
13100 dictionary of 1240 words. If a word in the input is not in the
13200 dictionary, it is dropped from the pattern being formed. Thus if the
13300 input were:
13400 WHAT IS YOUR CURRENT OCCUPATION?
13500 and the word "current" is not in the dictionary, the pattern at this
13600 phase becomes:
13700 (what is your occupation ?)
13800 Synonymic translations of some words are made so that the
13850 pattern becomes, for example:
13900 (what be your job ?)
14000 Groups of words are translated into a single word so that, for example,
14100 "for a living" becomes "job".
14200 Certain juxtaposed words are made into a single word, e.g.
14300 "get along with" becomes "getalongwith". This is done (1) to deal
14400 with idioms and (2) to prevent the segmenter, soon to be described,
14500 from segmenting at the wrong places. Contractions are expanded
14600 so that, for example, "I dont" becomes "I do not" and "Id"
14700 becomes "I would".
14800 Negations are handled by extracting the "not" from the pattern
14900 and keeping it aside for future reference.
14910 Some patterns have a pointer to a pattern of opposite meaning
14920 if a "not" could reverse their meanings. If this pointer is present
14930 and a "not" is found, then the pattern matched is replaced by its
14940 opposite.
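These preprocessing steps can be sketched in Python as follows. The word lists here are tiny invented stand-ins for the real 1240-word dictionary and translation tables, and the handling of "?" is simplified:

```python
# Illustrative stand-ins for the real word lists; not the original code.
DICTIONARY = {"what", "is", "be", "your", "you", "i", "do", "not", "work",
              "job", "occupation", "for", "a", "living"}
SYNONYMS = {"occupation": "job", "is": "be"}          # synonymic translations
GROUPS = {("for", "a", "living"): "job"}              # word groups -> one word
EXPANSIONS = {"dont": ["do", "not"], "id": ["i", "would"]}  # contractions

def preprocess(expression):
    words = expression.lower().rstrip(" ?.").split()
    # expand contractions
    expanded = []
    for w in words:
        expanded.extend(EXPANSIONS.get(w, [w]))
    # translate word groups into a single word
    collapsed, i = [], 0
    while i < len(expanded):
        if tuple(expanded[i:i + 3]) in GROUPS:
            collapsed.append(GROUPS[tuple(expanded[i:i + 3])])
            i += 3
        else:
            collapsed.append(expanded[i])
            i += 1
    # drop words not in the dictionary, then apply synonym translations
    pattern = [SYNONYMS.get(w, w) for w in collapsed if w in DICTIONARY]
    # extract "not", keeping it aside for future reference
    negated = "not" in pattern
    return [w for w in pattern if w != "not"], negated
```

On "WHAT IS YOUR CURRENT OCCUPATION?" this yields (["what", "be", "your", "job"], False): "current" is dropped as an unknown word, "is" translates to "be", and "occupation" to "job".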
15000
15100 SEGMENTING
15200
15400 Using the list of words in Fig. 1, the algorithm next breaks
15500 the pattern into fragments or segments which are more tractable
15600 than entire input expressions. The new pattern formed is simple, having
15700 no delimiters within it, or compound, i.e. being made up of two or
15800 more simple patterns. A simple pattern might be:
15900 (what be your job ?)
16000 (The "?" is thrown away in the first scan; the word order
16010 is enough to indicate when a question has been asked.)
16100 whereas a compound pattern would be:
16200 (give good examples)
16300 It is worth noting that after certain verbs ("think", "feel",etc)
16400 a bracketing occurs to replace the commonly omitted "that".
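The segmenting step might be sketched as follows, using a small invented stand-in for the delimiter list of Fig. 1 (prepositions, wh-words, meta-verbs); each delimiter starts a new segment:

```python
# Illustrative stand-in for the Fig. 1 delimiter list; not the real list.
DELIMITERS = {"what", "when", "where", "why", "how",   # wh-words
              "in", "on", "with", "for", "about",      # prepositions
              "think", "feel", "believe"}              # meta-verbs

def segment(pattern):
    """Break a word-list pattern into simple patterns, starting
    a new segment at each delimiter word."""
    segments, current = [], []
    for word in pattern:
        if word in DELIMITERS and current:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments
```

A pattern with no internal delimiters, like (what be your job), comes back as a single simple pattern, while (why do you think i be in hospital) breaks into the compound pattern ((why do you) (think i be) (in hospital)).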
16500
16600 MATCHING
16700
16900 The algorithm now attempts to match the segmented patterns
17000 with stored patterns. First a complete and perfect match is sought.
17100 When a match is found, the stored pattern name has a pointer to a
17200 name of a response function which decides what to do further. If a
17300 match is not found, a coarsening of the pattern is carried out
17400 by dropping all the "filler" words listed in Fig. 2. This list was
17500 derived from empirical experience with thousands of interviews
17600 carried out with PARRY1 by clinicians and by graduate students.
17700 If a match is not found, the contentive words in the pattern
17800 are dropped one at a time and a match attempted each time. If no
17900 match can be found at this point, the algorithm has arrived at a
18000 default condition, in which the appropriate response functions
18100 decide what to do. In a default condition, the model assumes
18200 control of the interview, asking the interviewer a question, continuing
18300 with the topic under discussion, or introducing a new topic.
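The matching cascade of this section can be sketched as follows; the stored pattern, response-function name, and filler list are invented for illustration and stand in for the real stored-pattern table and the Fig. 2 list:

```python
# Invented stored patterns and fillers; the real Fig. 2 list was
# derived empirically from thousands of PARRY1 interviews.
STORED = {("what", "be", "your", "job"): "answer_occupation"}
FILLERS = {"well", "really", "then"}

def match(pattern):
    # 1. seek a complete and perfect match
    if tuple(pattern) in STORED:
        return STORED[tuple(pattern)]
    # 2. first coarsening: drop all filler words
    coarse = [w for w in pattern if w not in FILLERS]
    if tuple(coarse) in STORED:
        return STORED[tuple(coarse)]
    # 3. second coarsening: drop one word at a time
    for i in range(len(coarse)):
        reduced = tuple(coarse[:i] + coarse[i + 1:])
        if reduced in STORED:
            return STORED[reduced]
    # 4. default condition: the response functions take control
    return "default"
```

For instance, (well what be your job) matches after the first coarsening drops the filler "well", and (what be your new job) matches after the second coarsening drops "new"; an input matching nothing falls through to the default condition.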